Results 1 - 20 of 39,791
1.
NPJ Syst Biol Appl ; 10(1): 34, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565568

ABSTRACT

Minimal Cut Sets (MCSs) identify sets of reactions which, when removed from a metabolic network, disable certain cellular functions. The traditional search for MCSs within genome-scale metabolic models (GSMMs) targets cellular growth, identifies reaction sets resulting in a lethal phenotype if disrupted, and retrieves a list of corresponding gene, mRNA, or enzyme targets. Using the dual link between MCSs and Elementary Flux Modes (EFMs), our logic programming-based tool aspefm was able to compute MCSs of any size from GSMMs in acceptable run times. The tool demonstrated better performance when computing large-sized MCSs than the mixed-integer linear programming methods. We applied the new MCSs methodology to a medically-relevant consortium model of two cross-feeding bacteria, Staphylococcus aureus and Pseudomonas aeruginosa. aspefm constraints were used to bias the computation of MCSs toward exchanged metabolites that could complement lethal phenotypes in individual species. We found that interspecies metabolite exchanges could play an essential role in rescuing single-species growth, for instance inosine could complement lethal reaction knock-outs in the purine synthesis, glycolysis, and pentose phosphate pathways of both bacteria. Finally, MCSs were used to derive a list of promising enzyme targets for consortium-level therapeutic applications that cannot be circumvented via interspecies metabolite exchange.


Subjects
Algorithms , Wound Infection , Humans , Biological Models , Metabolic Networks and Pathways/genetics , Genome
2.
Genet Sel Evol ; 56(1): 26, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565986

ABSTRACT

BACKGROUND: Chinese indigenous sheep are valuable resources with unique features and characteristics. They are distributed across regions with different climates in mainland China; however, few reports have analyzed the environmental adaptability of sheep based on their genome. We examined the variants and signatures of selection involved in adaptation to extreme humidity, altitude, and temperature conditions in 173 sheep genomes from 41 phenotypically and geographically representative Chinese indigenous sheep breeds to characterize the genetic basis underlying environmental adaptation in these populations. RESULTS: Based on the analysis of population structure, we inferred that Chinese indigenous sheep are divided into four groups: Kazakh (KAZ), Mongolian (MON), Tibetan (TIB), and Yunnan (YUN). We also detected a set of candidate genes that are relevant to adaptation to extreme environmental conditions, such as drought-prone regions (TBXT, TG, and HOXA1), high-altitude regions (DYSF, EPAS1, JAZF1, PDGFD, and NF1) and warm-temperature regions (TSHR, ABCD4, and TEX11). Among all these candidate genes, eight (ABCD4, CNTN4, DOCK10, LOC105608545, LOC121816479, SEM3A, SVIL, and TSHR) overlap between extreme environmental conditions. The TSHR gene shows a strong signature for positive selection in the warm-temperature group and harbors a single nucleotide polymorphism (SNP) missense mutation located between positions 90,600,001 and 90,650,001 on chromosome 7, which leads to a change in the protein structure of TSHR and influences its stability. CONCLUSIONS: Analysis of the signatures of selection uncovered genes that are likely related to environmental adaptation and a SNP missense mutation in the TSHR gene that affects the protein structure and stability. It also provides information on the evolution of the phylogeographic structure of Chinese indigenous sheep populations. These results provide important genetic resources for future breeding studies and new perspectives on how animals can adapt to climate change.


Subjects
Genome , Genetic Selection , Sheep/genetics , Animals , China , DNA Sequence Analysis , Altitude , Single Nucleotide Polymorphism
3.
Sci Rep ; 14(1): 8073, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580653

ABSTRACT

The fishing cat, Prionailurus viverrinus, faces a population decline, increasing the importance of maintaining healthy zoo populations. Unfortunately, zoo-managed individuals currently face a high prevalence of transitional cell carcinoma (TCC), a form of bladder cancer. To investigate the genetics of inherited diseases among captive fishing cats, we present a chromosome-scale assembly, generate the pedigree of the zoo-managed population, reaffirm the close genetic relationship with the Asian leopard cat (Prionailurus bengalensis), and identify 7.4 million single nucleotide variants (SNVs) and 23,432 structural variants (SVs) from whole genome sequencing (WGS) data of healthy and TCC cats. Only BRCA2 was found to have a high recurrent number of missense mutations in fishing cats diagnosed with TCC when compared to inherited human cancer risk variants. These new fishing cat genomic resources will aid conservation efforts to improve their genetic fitness and enhance the comparative study of feline genomes.


Subjects
Transitional Cell Carcinoma , Urinary Bladder Neoplasms , Cats , Animals , Humans , Genome/genetics , Urinary Bladder Neoplasms/pathology , Transitional Cell Carcinoma/pathology , Genomics , Germ Cells/pathology
4.
Sci Data ; 11(1): 340, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580722

ABSTRACT

Despite the rapid advances in sequencing technology, limited genomic resources are currently available for phytophagous spider mites, which include many important agricultural pests. One of these pests is Tetranychus piercei (McGregor), a serious banana pest in East Asia exhibiting remarkable tolerance to high temperature. In this study, we assembled a high-quality genome of T. piercei using a combination of PacBio long reads and Illumina short reads sequencing. With the assistance of chromatin conformation capture technology, 99.9% of the contigs were anchored into three pseudochromosomes with a total size of 86.02 Mb. Repetitive elements, accounting for 14.16% of this genome (12.20 Mb), are predominantly composed of long-terminal repeats (30.7%). By combining evidence of ab initio prediction, transcripts, and homologous proteins, we annotated 11,881 protein-coding genes. Both the genome and proteins have high BUSCO completeness scores (>94%). This high-quality genome, along with reliable annotation, provides a valuable resource for investigating the high-temperature tolerance of this species and exploring the genomic basis that underlies the host range evolution of spider mites.


Subjects
Tetranychidae , Animals , Chromosomes , Genome , Genomics , Molecular Sequence Annotation , Phylogeny , Repetitive Nucleic Acid Sequences , Tetranychidae/genetics
6.
BMC Genomics ; 25(1): 349, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-38589806

ABSTRACT

Fleece traits are important economic traits of goats. With the reduction of sequencing and genotyping costs and the improvement of related technologies, genomic selection for goats has become possible. This study collected pedigree, phenotype and genotype information for 2299 Inner Mongolia Cashmere goats (IMCGs). We estimated fixed effects and compared the estimates of variance components, heritability and genomic predictive ability of fleece traits in IMCGs when using pedigree-based Best Linear Unbiased Prediction (ABLUP), Genomic BLUP (GBLUP) or single-step GBLUP (ssGBLUP). The fleece traits considered were cashmere production (CP), cashmere diameter (CD), cashmere length (CL) and fiber length (FL). Year of production, sex, herd and individual age had highly significant effects on the four fleece traits (P < 0.01), so all of these factors should be considered when the genetic parameters of fleece traits in IMCGs are evaluated. The heritabilities of FL, CL, CP and CD with the ABLUP, GBLUP and ssGBLUP methods were 0.26 to 0.31, 0.05 to 0.08, 0.15 to 0.20 and 0.22 to 0.28, respectively. Therefore, it can be inferred that the genetic progress of CL is relatively slow. The predictive ability for fleece traits in IMCGs with GBLUP (56.18% to 69.06%) and ssGBLUP (66.82% to 73.70%) was significantly higher than that with ABLUP (36.73% to 41.25%). The predictive ability of ssGBLUP was significantly (29% to 33%) higher than that of ABLUP and slightly (4% to 14%) higher than that of GBLUP. ssGBLUP is therefore a superior method for genomic selection of fleece traits in Inner Mongolia Cashmere goats.


Subjects
Genome , Goats , Humans , Animals , Goats/genetics , Genomics/methods , Phenotype , Genotype , Genetic Models
7.
Genome Biol ; 25(1): 101, 2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38641647

ABSTRACT

Many bioinformatics methods seek to reduce reference bias, but no methods exist to comprehensively measure it. Biastools analyzes and categorizes instances of reference bias. It works in various scenarios: when the donor's variants are known and reads are simulated; when donor variants are known and reads are real; and when variants are unknown and reads are real. Using biastools, we observe that more inclusive graph genomes result in fewer biased sites. We find that end-to-end alignment reduces bias at indels relative to local aligners. Finally, we use biastools to characterize how T2T references improve large-scale bias.


Subjects
Genome , Genomics , Genomics/methods , Computational Biology , INDEL Mutation , Bias , DNA Sequence Analysis/methods , Software , High-Throughput Nucleotide Sequencing/methods
8.
Mol Biol Rep ; 51(1): 560, 2024 Apr 20.
Article in English | MEDLINE | ID: mdl-38643284

ABSTRACT

BACKGROUND: Zygotic genome activation (ZGA) is an important event in the early embryo development, and human embryo developmental arrest has been highly correlated with ZGA failure in clinical studies. Although a few studies have linked maternal factors to mammalian ZGA, more studies are needed to fully elucidate the maternal factors that are involved in ZGA. METHODS AND RESULTS: In this study, we utilized published single-cell RNA sequencing data from a Dux-mediated mouse embryonic stem cell to induce a 2-cell-like transition state and selected potential drivers for the transition according to an RNA velocity analysis. CONCLUSIONS: An overlap of potential candidate markers of 2-cell-like-cells identified in this research with markers generated by various data sets suggests that Trim75 is a potential driver of minor ZGA and may recruit EP300 and establish H3K27ac in the gene body of minor ZGA genes, thereby contributing to mammalian preimplantation embryo development.


Subjects
Developmental Gene Expression Regulation , Zygote , Animals , Humans , Mice , Embryonic Development/genetics , Genome/genetics , Mammalian Embryo , Mammals
9.
Biotechnol J ; 19(4): e2300691, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38622798

ABSTRACT

CRISPR/Cas9 technology, combined with somatic cell nuclear transplantation (SCNT), represents the primary approach to generating gene-edited pigs. The inefficiency in acquiring gene-edited nuclear donors is attributed to low editing and delivery efficiency, both closely linked to the selection of CRISPR/Cas9 forms. However, there is currently no direct method to evaluate the efficiency of CRISPR/Cas9 editing in porcine genomes. A platform based on fluorescence reporting signals and micropattern arrays was developed in this study, to visually assess the efficiency of gene editing. The optimal specifications for culturing porcine cells, determined by the quantity and state of cells grown on micropattern arrays, were a diameter of 200 µm and a spacing of 150 µm. By visualizing the area of fluorescence loss and measuring the gray value of the micropattern arrays, it was quickly determined that the mRNA form targeting porcine cells exhibited the highest editing efficiency compared to DNA and Ribonucleoprotein (RNP) forms of CRISPR/Cas9. Subsequently, four homozygotes of the β4GalNT2 gene knockout were successfully obtained through the mRNA form, laying the groundwork for the subsequent generation of gene-edited pigs. This platform facilitates a quick, simple, and effective evaluation of gene knockout efficiency. Additionally, it holds significant potential for swiftly testing novel gene editing tools, assessing delivery methods, and tailoring evaluation platforms for various cell types.


Subjects
CRISPR-Cas Systems , Gene Editing , Animals , Swine , CRISPR-Cas Systems/genetics , Gene Editing/methods , Gene Knockout Techniques , Genome , Messenger RNA/genetics
10.
Semin Cell Dev Biol ; 161-162: 31-41, 2024.
Article in English | MEDLINE | ID: mdl-38598944

ABSTRACT

Antagonistic coevolution, arising from genetic conflict, can drive rapid evolution and biological innovation. Conflict can arise both between organisms and within genomes. This review focuses on budding yeasts as a model system for exploring intra- and inter-genomic genetic conflict, highlighting in particular the 2-micron (2µ) plasmid as a model selfish element. The 2µ is found widely in laboratory strains and industrial isolates of Saccharomyces cerevisiae and has long been known to cause host fitness defects. Nevertheless, the plasmid is frequently ignored in the context of genetic, fitness, and evolution studies. Here, I make a case for further exploring the evolutionary impact of the 2µ plasmid as well as other selfish elements of budding yeasts, discuss recent advances, and, finally, future directions for the field.


Subjects
Saccharomycetales , Saccharomycetales/genetics , Saccharomyces cerevisiae/genetics , Plasmids/genetics , Genome
11.
Cell Syst ; 15(4): 388-408.e4, 2024 Apr 17.
Article in English | MEDLINE | ID: mdl-38636458

ABSTRACT

Genome-wide measurement of ribosome occupancy on mRNAs has enabled empirical identification of translated regions, but high-confidence detection of coding regions that overlap annotated coding regions has remained challenging. Here, we report a sensitive and robust algorithm that revealed the translation of 388 N-terminally truncated proteins in budding yeast, more than 30-fold more than previously known. We extensively experimentally validated them and defined two classes. The first class lacks large portions of the annotated protein and tends to be produced from a truncated transcript. We show that two such cases, Yap5truncation and Pus1truncation, have condition-specific regulation and distinct functions from their respective annotated isoforms. The second class of truncated protein isoforms lacks only a small region of the annotated protein and is less likely to be produced from an alternative transcript isoform. Many display different subcellular localizations than their annotated counterpart, representing a common strategy for dual localization of otherwise functionally identical proteins. A record of this paper's transparent peer review process is included in the supplemental information.


Subjects
Saccharomyces cerevisiae Proteins , Saccharomyces cerevisiae , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Protein Isoforms/genetics , Messenger RNA/genetics , Messenger RNA/metabolism , Ribosomes/genetics , Ribosomes/metabolism , Genome , Saccharomyces cerevisiae Proteins/genetics , Basic-Leucine Zipper Transcription Factors
12.
Nat Commun ; 15(1): 2813, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38561336

ABSTRACT

CCCTC-binding factor (CTCF), a ubiquitously expressed and highly conserved protein, is known to play a critical role in chromatin structure. Post-translational modifications (PTMs) diversify the functions of protein to regulate numerous cellular processes. However, the effects of PTMs on the genome-wide binding of CTCF and the organization of three-dimensional (3D) chromatin structure have not been fully understood. In this study, we uncovered the PTM profiling of CTCF and demonstrated that CTCF can be O-GlcNAcylated and arginine methylated. Functionally, we demonstrated that O-GlcNAcylation inhibits CTCF binding to chromatin. Meanwhile, deficiency of CTCF O-GlcNAcylation results in the disruption of loop domains and the alteration of chromatin loops associated with cellular development. Furthermore, the deficiency of CTCF O-GlcNAcylation increases the expression of developmental genes and negatively regulates maintenance and establishment of stem cell pluripotency. In conclusion, these results provide key insights into the role of PTMs for the 3D chromatin structure.


Subjects
Genome , Post-Translational Protein Processing , CCCTC-Binding Factor/metabolism , Cell Differentiation , Chromatin
13.
BMC Genomics ; 25(1): 324, 2024 Apr 01.
Article in English | MEDLINE | ID: mdl-38561675

ABSTRACT

Lactococcus lactis is widely applied by the dairy industry for the fermentation of milk into products such as cheese. Adaptation of L. lactis to the dairy environment often depends on functions encoded by mobile genetic elements (MGEs) such as plasmids. Other L. lactis MGEs that contribute to industrially relevant traits like antimicrobial production and carbohydrate utilization capacities belong to the integrative conjugative elements (ICE). Here we investigate the prevalence of ICEs in L. lactis using an automated search engine that detects colocalized, ICE-associated core-functions (involved in conjugation or mobilization) in lactococcal genomes. This approach enabled the detection of 36 candidate-ICEs in 69 L. lactis genomes. By phylogenetic analysis of conserved protein functions encoded in all lactococcal ICEs, these 36 ICEs could be classified in three main ICE-families that encompass 7 distinguishable ICE-integrases and are characterized by apparent modular-exchangeability and plasticity. Finally, we demonstrate that phylogenetic analysis of the conjugation-associated VirB4 ATPase function differentiates ICE- and plasmid-derived conjugation systems, indicating that conjugal transfer of lactococcal ICEs and plasmids involves genetically distinct machineries. Our genomic analysis and sequence-based classification of lactococcal ICEs creates a comprehensive overview of the conserved functional repertoires encoded by this family of MGEs in L. lactis, which can facilitate the future exploitation of the functional traits they encode by ICE mobilization to appropriate starter culture strains.


Subjects
Lactococcus lactis , Lactococcus lactis/genetics , Phylogeny , Plasmids/genetics , Proteins/metabolism , Genome , Genetic Conjugation , DNA Transposable Elements
14.
BMC Genomics ; 25(1): 331, 2024 Apr 02.
Article in English | MEDLINE | ID: mdl-38565992

ABSTRACT

BACKGROUND: The pig (Sus scrofa) is one of the oldest domesticated livestock species and has undergone extensive improvement through modern breeding. European breeds have advantages in lean meat development and a highly productive body type, whereas Asian breeds possess extraordinary fat deposition and reproductive performance. Consequently, Eurasian breeds have been extensively used to develop modern commercial breeds for fast growth and high prolificacy. However, limited by sequencing technology, the genome architecture of some newly developed breeds and the human-mediated impact on their genomes are still unknown. RESULTS: Through whole-genome analysis of 178 individuals from an Asian locally developed pig breed, the Beijing Black pig, and its two ancestors from two different continents, we found pervasive inconsistency between gene trees and species trees across the genome of the Beijing Black pig, which suggests an introgressive hybrid origin. Interestingly, we discovered that this developed breed shares more genetic relatedness with European pigs and carries an unexpected introgression from Asian pigs, which indicates that human-mediated introgression can shape porcine genome architecture in a completely different way from natural introgression. We identified 554 genomic regions spanning 63.30 Mb with signals of introgression from the Asian ancestry into the Beijing Black pig; the genes in these regions were enriched in pathways associated with meat quality, fertility, and disease resistance. Additionally, 7.77% of genomic regions were recognized as having been under selection. Moreover, combined with the results of a genome-wide association study for meat quality traits in a population of 1537 Beijing Black pigs, two important candidate genes related to meat quality traits were identified: DNAJC6 is related to intramuscular fat content and fat deposition, and RUFY4 is related to meat pH and tenderness. CONCLUSIONS: Our research provides insight into the origins of newly developed breeds and the genome-wide selection signatures left in such breeds by humans during modern breeding.


Subjects
Genetic Introgression , Genome-Wide Association Study , Humans , Animals , Swine/genetics , Genome , Genomics/methods , Breeding , Single Nucleotide Polymorphism , Sus scrofa/genetics , Genetic Selection
15.
PLoS One ; 19(4): e0297987, 2024.
Article in English | MEDLINE | ID: mdl-38578816

ABSTRACT

Sex identification is a common objective in molecular ecology. While many vertebrates display sexual dimorphism, determining the sex can be challenging in certain situations, such as species lacking clear sex-related phenotypic characteristics or in studies using non-invasive methods. In these cases, DNA analyses serve as valuable tools not only for sex determination but also for validating sex assignment based on phenotypic traits. In this study, we developed a bioinformatic framework for sex assignment using genomic data obtained through GBS, and having an available closely related genome assembled at the chromosome level. Our method consists of two ad hoc indexes that rely on the different properties of the mammalian heteromorphic sex chromosomes. For this purpose, we mapped RAD-seq loci to a reference genome and then obtained missingness and coverage depth values for the autosomes and X and Y chromosomes of each individual. Our methodology successfully determined the sex of 165 fur seals that had been phenotypically sexed in a previous study and 40 sea lions sampled in a non-invasive way. Additionally, we evaluated the accuracy of each index in sequences with varying average coverage depths, with Index Y proving greater reliability and robustness in assigning sex to individuals with low-depth coverage. We believe that the approach presented here can be extended to any animal taxa with known heteromorphic XY/ZW sex chromosome systems and that it can tolerate various qualities of GBS sequencing data.


Subjects
Genome , Sex Chromosomes , Humans , Animals , Reproducibility of Results , Genome/genetics , Sex Chromosomes/genetics , Y Chromosome , Genomics , Mammals/genetics
16.
Sci Adv ; 10(14): eadl4600, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38579006

ABSTRACT

Quantifying the structural variants (SVs) in nonhuman primates could provide a niche to clarify the genetic backgrounds underlying human-specific traits, but such a resource is largely lacking. Here, we report an accurate SV map in a population of 562 rhesus macaques, verified by in-house benchmarks of eight macaque genomes with long-read sequencing and another one with genome assembly. This map indicates stronger selective constraints on inversions at regulatory regions, suggesting a strategy for prioritizing those with the most important functions. Accordingly, we identified 75 human-specific inversions and prioritized them. The top-ranked inversions have substantially shaped the human transcriptome, through their dual effects of reconfiguring the ancestral genomic architecture and introducing regional mutation hotspots at the inverted regions. As a proof of concept, we linked APCDD1, located on one of these inversions and down-regulated specifically in humans, to neuronal maturation and cognitive ability. We thus highlight the role of inversions in shaping human uniqueness in brain development.


Subjects
Genome , Genomics , Animals , Humans , Macaca mulatta , Brain
17.
BMC Genomics ; 25(1): 346, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38580907

ABSTRACT

BACKGROUND: The yak (Bos grunniens) is a large ruminant species that lives in high-altitude regions and exhibits excellent adaptation to the plateau environments. To further understand the genetic characteristics and adaptive mechanisms of yak, we have developed a multi-omics database of yak including genome, transcriptome, proteome, and DNA methylation data. DESCRIPTION: The Yak Genome Database ( http://yakgenomics.com/ ) integrates the research results of genome, transcriptome, proteome, and DNA methylation, and provides an integrated platform for researchers to share and exchange omics data. The database contains 26,518 genes, 62 transcriptomes, 144,309 proteome spectra, and 22,478 methylation sites of yak. The genome module provides access to yak genome sequences, gene annotations and variant information. The transcriptome module offers transcriptome data from various tissues of yak and cattle strains at different developmental stages. The proteome module presents protein profiles from diverse yak organs. Additionally, the DNA methylation module shows the DNA methylation information at each base of the whole genome. Functions of data downloading and browsing, functional gene exploration, and experimental practice were available for the database. CONCLUSION: This comprehensive database provides a valuable resource for further investigations on development, molecular mechanisms underlying high-altitude adaptation, and molecular breeding of yak.


Subjects
Multiomics , Proteome , Animals , Cattle/genetics , Proteome/genetics , Genome , Transcriptome , Molecular Sequence Annotation
18.
Brief Bioinform ; 25(3), 2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38581418

ABSTRACT

Following the milestone success of the Human Genome Project, the 'Encyclopedia of DNA Elements (ENCODE)' initiative was launched in 2003 to unearth information about the numerous functional elements within the genome. This endeavor coincided with the emergence of numerous novel technologies, accompanied by the provision of vast amounts of whole-genome sequences and high-throughput data such as ChIP-Seq and RNA-Seq. Extracting biologically meaningful information from this massive dataset has become a critical aspect of many recent studies, particularly in annotating and predicting the functions of unknown genes. The core idea behind genome annotation is to identify genes and various functional elements within the genome sequence and infer their biological functions. Traditional wet-lab experimental methods still rely on extensive efforts for functional verification. However, early bioinformatics algorithms and software primarily employed shallow learning techniques; thus, their ability to characterize data and learn features was limited. With the widespread adoption of RNA-Seq technology, scientists from the biological community began to harness the potential of machine learning and deep learning approaches for gene structure prediction and functional annotation. In this context, we reviewed both conventional methods and contemporary deep learning frameworks, and highlighted novel perspectives on the challenges arising during annotation, underscoring the dynamic nature of this evolving scientific landscape.


Subjects
Deep Learning , Humans , Genome , Algorithms , Software , Computational Biology/methods , Molecular Sequence Annotation
19.
Sci Rep ; 14(1): 9155, 2024 Apr 21.
Article in English | MEDLINE | ID: mdl-38644393

ABSTRACT

Deep learning models (DLMs) have gained importance in predicting, detecting, translating, and classifying a diversity of inputs. In bioinformatics, DLMs have been used to predict protein structures, transcription factor-binding sites, and promoters. In this work, we propose a hybrid model to identify transcription factors (TFs) among prokaryotic and eukaryotic protein sequences, named Deep Regulation (DeepReg) model. Two architectures were used in the DL model: a convolutional neural network (CNN), and a bidirectional long-short-term memory (BiLSTM). DeepReg reached a precision of 0.99, a recall of 0.97, and an F1-score of 0.98. The quality of our predictions, the bias-variance trade-off approach, and the characterization of new TF predictions were evaluated and compared against those produced by DeepTFactor, as well as against experimental data from three model organisms. Predictions based on our DLM tended to exhibit less variance and bias than those from DeepTFactor, thus increasing reliability and decreasing overfitting.
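
The three metrics reported above are related: the F1-score is the harmonic mean of precision and recall. The following is a minimal sketch, not taken from the DeepReg paper, that checks the reported values against that standard definition; the function name `f1_score` and the variable names are illustrative assumptions.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract above.
reported_precision = 0.99
reported_recall = 0.97

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 2))
```

Rounded to two decimals, the result agrees with the F1-score of 0.98 given in the abstract.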


Subjects
Deep Learning , Transcription Factors , Transcription Factors/genetics , Transcription Factors/metabolism , Computational Biology/methods , Prokaryotic Cells/metabolism , Computer Neural Networks , Eukaryota/genetics , Genome , Eukaryotic Cells/metabolism , Binding Sites
20.
Sci Rep ; 14(1): 8396, 2024 Apr 10.
Article in English | MEDLINE | ID: mdl-38600096

ABSTRACT

Disease-causing variants have been identified for less than 20% of suspected equine genetic diseases. Whole genome sequencing (WGS) allows rapid identification of rare disease causal variants. However, interpreting the clinical variant consequence is confounded by the number of predicted deleterious variants that healthy individuals carry (predicted genetic burden). Estimation of the predicted genetic burden and baseline frequencies of known deleterious or phenotype-associated variants within and across the major horse breeds has not been performed. We used WGS of 605 horses across 48 breeds to identify 32,818,945 variants, demonstrate a high predicted genetic burden (median 730 variants/horse, interquartile range: 613-829), show breed differences in predicted genetic burden across 12 target breeds, and estimate the high frequencies of some previously reported disease variants. This large-scale variant catalog for a major and highly athletic domestic animal species will enhance its ability to serve as a model for human phenotypes and improve our ability to discover the bases for important equine phenotypes.


Subjects
Breeding , Genome , Horses/genetics , Animals , Humans , Phenotype , Single Nucleotide Polymorphism